Optimal target search on a fast folding polymer chain with volume exchange

We study the search process of a target on a rapidly folding polymer ('DNA') by an ensemble
of particles ('proteins'), whose search combines 1D diffusion along the chain, Lévy type diffusion
mediated by chain looping, and volume exchange. A rich behavior of the search process is obtained
with respect to the physical parameters, in particular, for the optimal search.

PACS numbers: 05.40.Fb, 02.50.Ey, 82.39.-k



Introduction. Lévy flights (LFs) are random walks whose jump lengths $x$ are distributed like
$\lambda(x) \simeq |x|^{-1-\alpha}$ with exponent $0 < \alpha < 2$ [1]. Their probability density
to be at position $x$ at time $t$ has the characteristic function
$P(q,t) \equiv \int_{-\infty}^{\infty} e^{iqx} P(x,t)\,dx = \exp(-D_L |q|^{\alpha} t)$,
a consequence of the generalized central limit theorem [2]; in that sense, LFs are a natural
extension of normal Gaussian diffusion ($\alpha = 2$). LFs occur in a wide range of systems [3];
in particular, they represent an optimal search mechanism in contrast to locally oversampling
Gaussian search [4]. Dynamically, LFs can be described by a space-fractional diffusion equation
$\partial P/\partial t = D_L\, \partial^{\alpha} P(x,t)/\partial |x|^{\alpha}$, a convenient basis
to introduce additional terms, as shown below. $D_L$ is a diffusion constant of dimension
$\mathrm{cm}^{\alpha}/\mathrm{sec}$, and the fractional derivative is defined via its Fourier
transform, $\mathscr{F}\{\partial^{\alpha} P(x,t)/\partial |x|^{\alpha}\} = -|q|^{\alpha} P(q,t)$ [5].
LFs exhibit superdiffusion in the sense that
$\langle |x|^{\zeta} \rangle^{2/\zeta} \simeq (D_L t)^{2/\alpha}$ ($0 < \zeta < \alpha$), spreading
faster than the linear dependence of standard diffusion ($\alpha = 2$).
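As a small numerical illustration of the characteristic function quoted above, the following sketch inverts $\exp(-D_L|q|^{\alpha}t)$ at $x = 0$ and compares the result with the closed form $\Gamma(1+1/\alpha)/[\pi (D_L t)^{1/\alpha}]$. That closed form is a standard Gamma-function integral and is not quoted in the text; the function names and the log-grid quadrature are illustrative choices, not part of the original work. The $t^{-1/\alpha}$ decay of the central density is the counterpart of the superdiffusive spreading of the width.

```python
import math

def levy_density_at_origin(alpha, D_L, t, dy=0.02, y_max=40.0):
    """P(0,t) = (1/pi) * int_0^inf exp(-D_L q^alpha t) dq.

    Evaluated with the trapezoidal rule on a logarithmic grid q = exp(y);
    dy and y_max are illustrative defaults.
    """
    n = int(round(2.0 * y_max / dy))
    total = 0.0
    for i in range(n + 1):
        q = math.exp(-y_max + i * dy)
        weight = 0.5 if i in (0, n) else 1.0
        # dq = q dy on the logarithmic grid
        total += weight * q * math.exp(-D_L * q**alpha * t)
    return total * dy / math.pi

def levy_density_at_origin_closed(alpha, D_L, t):
    """Closed form Gamma(1 + 1/alpha) / [pi (D_L t)^(1/alpha)]."""
    return math.gamma(1.0 + 1.0 / alpha) / (math.pi * (D_L * t) ** (1.0 / alpha))
```

For $\alpha = 2$ the closed form reduces to the Gaussian value $1/\sqrt{4\pi D_L t}$, which gives a quick sanity check of the normalization.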

A prime example of an LF is linear particle diffusion to next neighbor sites on a fast folding
('annealed') polymer that permits intersegmental jumps at chain contact points (see Fig. 1) due
to polymer looping [6, 7]. The contour length $|x|$ stored in a loop between such contact points
is distributed in 3D like $\lambda(x) \simeq |x|^{-1-\alpha}$, where $\alpha = 1/2$ for Gaussian
chains ($\theta$ solvent), and $\alpha \approx 1.2$ for self-avoiding walk chains (good solvent) [7].

While non-specifically bound [8], proteins can diffusively slide along the DNA backbone in search
of their specific target site, as long as the binding energy does not exceed a certain limit [9].
Under overstretching conditions preventing looping, pure 1D sliding search could be observed
in vitro [10]. In the absence of the stretching force, the combination of intersegmental jumps
(LF component) and 1D sliding may be a good approximation to the motion of binding proteins or
enzymes along a DNA. In general, however, proteins detach to the volume and, after a bulk
excursion, reattach successively before reaching the target. This mediation by de- and
(re)adsorption rates $k_{\rm off}$ and $k_{\rm on}$ is described by the Berg-von Hippel model
sketched in Fig. 1 [11]. We here explore by combination of analytical and numerical analysis for
the first time (1) the combination of 1D sliding, intersegmental transfer and volume exchange;
(2) a particle number density instead of a single searching protein; and (3) the explicit
determination of the first arrival to the target, per se a non-trivial problem for LFs [12].
Note that, although the process we study is a generic soft matter problem, we here adopt the
DNA-protein language for illustration.

FIG. 1: Search mechanisms in Eq. (1).

Theoretical description. In our description of the target search process, we use the density per
length $n(x,t)$ of proteins on the DNA as the relevant dynamical quantity ($x$ is the distance
along the DNA contour). Apart from intersegmental transfer, we include 1D sliding along the DNA
with diffusion constant $D_B$, protein dissociation with rate $k_{\rm off}$, and (re)adsorption
with rate $k_{\rm on}$ from a bath of proteins of concentration $n_{\rm bulk}$. The dynamics of
$n(x,t)$ is thus governed by the equation [13]

$$\frac{\partial}{\partial t} n(x,t) = \left( D_B \frac{\partial^2}{\partial x^2} + D_L \frac{\partial^{\alpha}}{\partial |x|^{\alpha}} - k_{\rm off} \right) n(x,t) + k_{\rm on} n_{\rm bulk} - j(t)\,\delta(x). \qquad (1)$$

Here, $j(t)$ is the flux into the target located at $x = 0$. We determine the flux $j(t)$ by
assuming that the target is perfectly absorbing: $n(0,t) = 0$. Let the system initially be at
equilibrium, except that the target is unoccupied; then, the initial protein density is
$n_0 = n(x,0) = k_{\rm on} n_{\rm bulk}/k_{\rm off}$ [14]. The total number of particles that have
arrived at the target up to time $t$ is $J(t) = \int_0^t dt'\, j(t')$. We derive explicit analytic
expressions for $J(t)$ in different limiting regimes, and study the general case numerically. We
use $J(t)$ to obtain the mean first arrival time $T$ to the target; in particular, to find the
value of $k_{\rm off}$ that minimizes $T$.
FIG. 2: Number of proteins arrived at the target up to $t$. Numerical solutions of Eq. (3) and
limiting regimes.

To proceed, we Laplace and Fourier transform Eq. (1):

$$u\, n(q,u) - 2\pi n_0 \delta(q) = -\left( D_B q^2 + D_L |q|^{\alpha} + k_{\rm off} \right) n(q,u) + 2\pi k_{\rm on} n_{\rm bulk}\, \delta(q)/u - j(u), \qquad (2)$$

with $n(q,u) = \mathscr{L}\{n(q,t)\}$. Integration over $q$ produces
$J(u) = j(u)/u = n_0/[u^2 W_0(u)]$ due to the perfect absorption condition
$n(0,u) = (2\pi)^{-1} \int_{-\infty}^{\infty} dq\, n(q,u) = 0$. Or,

$$\int_0^t dt'\, W_0(t - t')\, J(t') = n_0 t \qquad (3)$$

in the $t$-domain. Eq. (3) is a Volterra integral equation of the first kind, whose kernel $W_0$
is read off Eq. (2):

$$W_0(u) = \int_{-\infty}^{\infty} \frac{dq}{2\pi}\, \frac{1}{D_B q^2 + D_L |q|^{\alpha} + k_{\rm off} + u}, \qquad (4)$$

that is the Laplace transform of the Green's function of $n(x,t)$ at $x = 0$. Back-transforming,
we obtain
$W_0(t) = (2\pi)^{-1} \int_{-\infty}^{\infty} dq\, \exp\left( -(D_B q^2 + D_L |q|^{\alpha} + k_{\rm off})\, t \right)$,
which has a singularity at $t = 0$. Eq. (3) can be solved numerically by approximating $J(t)$ by
a piecewise linear function, converting the integral equation to a linear set of equations.
Typical plots are shown in Fig. 2.
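The piecewise linear discretization of Eq. (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the routine name `solve_first_kind_volterra` is hypothetical, the kernel is supplied through its twice-integrated form $K_2(s) = \int_0^s ds' \int_0^{s'} ds''\, W_0(s'')$ so that the $t^{-1/2}$ singularity is integrated analytically, and the demonstration uses the pure sliding kernel $W_0(t) = 1/(2\sqrt{\pi D_B t})$, which is Eq. (4) back-transformed for $D_L = 0$ and $k_{\rm off} = 0$. After one integration by parts (with $J(0) = 0$) the unknown slopes of $J$ obey a lower triangular linear system that is solved by forward substitution.

```python
import math

def solve_first_kind_volterra(K2, n0, h, steps):
    """Solve int_0^t W(t-t') J(t') dt' = n0*t for piecewise linear J with J(0)=0.

    K2(s) is the twice-integrated kernel (assumed to be known in closed form).
    Returns the list [J(0), J(h), ..., J(steps*h)].
    """
    # a[k]: weight of the slope on the interval lying k steps before the collocation time
    a = [K2((k + 1) * h) - K2(k * h) for k in range(steps)]
    slopes = []
    J = [0.0]
    for i in range(1, steps + 1):
        # contribution of the already determined slopes to the equation at t_i = i*h
        known = sum(slopes[j] * a[i - 1 - j] for j in range(i - 1))
        m = (n0 * i * h - known) / a[0]
        slopes.append(m)
        J.append(J[-1] + m * h)
    return J

def sliding_K2(s, D_B):
    """Twice-integrated sliding kernel W(t) = 1 / (2 sqrt(pi D_B t))."""
    return (2.0 / 3.0) * s ** 1.5 / math.sqrt(math.pi * D_B)
```

For this test kernel the exact solution is the sliding law $J(t) = 4 n_0 \sqrt{D_B t/\pi}$, so the output can be compared directly with the analytic result. For the full kernel with $D_L > 0$ and $k_{\rm off} > 0$, $K_2$ would have to be obtained by quadrature.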

Eq. (4) reveals only two relevant time scales: $k_{\rm off}^{-1}$ and
$\tau_{BL} = (D_B^{\alpha}/D_L^{2})^{1/(2-\alpha)}$. We now obtain asymptotic results for small
and large $(k_{\rm off} + u)$, compared to $\tau_{BL}^{-1}$.

$k_{\rm off} + u \gg \tau_{BL}^{-1}$: In this limit, the denominator of the integrand in Eq. (4)
is dominated either by the term $D_B q^2$ or by $k_{\rm off} + u$ for any $q$; we find the
approximation [15]

$$W_0(u) \simeq W_0(u)|_{D_L = 0} = [D_B (k_{\rm off} + u)]^{-1/2}/2. \qquad (5)$$

$k_{\rm off} + u \ll \tau_{BL}^{-1}$ and $\alpha > 1$ ('connected LFs'): Here, a singularity
exists at small $q$ as $k_{\rm off} + u \to 0$. For finite but small $k_{\rm off} + u$, the
integrand is dominated by the $D_L |q|^{\alpha}$ term compared to $D_B q^2$ at small $q$, yielding

$$W_0(u) \simeq \frac{(k_{\rm off} + u)^{1/\alpha - 1}}{\alpha \sin(\pi/\alpha)\, D_L^{1/\alpha}}. \qquad (6)$$

$k_{\rm off} + u \ll \tau_{BL}^{-1}$ and $\alpha < 1$ ('disconnected LFs'): Now, the singularity
is weak, and the integral becomes

$$W_0(u) \simeq W_0(0)|_{k_{\rm off} = 0} = \frac{D_B^{\alpha/[2(2-\alpha)] - 1/2}}{(2-\alpha) \sin([1-\alpha]\pi/[2-\alpha])\, D_L^{1/(2-\alpha)}}. \qquad (7)$$

From these limits, we now infer the behavior of $J(t)$, based on Tauberian theorems stating that
$J(t)$ at $t \to 0$ is determined by $J(u)$ at $u \to \infty$, and vice versa [?]. We discover a
rich variety of domains, compare Tab. I.
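The three limiting forms of the kernel can be checked against a direct quadrature of $W_0 = \pi^{-1} \int_0^{\infty} dq\, [D_B q^2 + D_L q^{\alpha} + s]^{-1}$ with $s = k_{\rm off} + u$. The sketch below is an illustration only: the function names are hypothetical, the quadrature (trapezoidal rule on a logarithmic grid) is one convenient choice, and the closed forms for the two Lévy-dominated limits are the ones implied by the constants $C_4 = 1/[\alpha \sin(\pi/\alpha)]$ and $C_3 = \{(2-\alpha)\sin([1-\alpha]\pi/[2-\alpha])\}^{-1}$ quoted in the text. Each limit is isolated exactly by switching off one term (setting $D_L = 0$, $D_B = 0$, or $s = 0$).

```python
import math

def W0(s, alpha, D_B, D_L, dy=0.05, y_max=40.0):
    """(1/pi) * int_0^inf dq / (D_B q^2 + D_L q^alpha + s), with s = k_off + u."""
    n = int(round(2.0 * y_max / dy))
    total = 0.0
    for i in range(n + 1):
        q = math.exp(-y_max + i * dy)
        weight = 0.5 if i in (0, n) else 1.0
        # dq = q dy on the logarithmic grid
        total += weight * q / (D_B * q * q + D_L * q**alpha + s)
    return total * dy / math.pi

def W0_sliding(s, D_B):
    """Sliding limit, D_L = 0."""
    return 0.5 / math.sqrt(D_B * s)

def W0_connected(s, alpha, D_L):
    """Connected Levy limit (alpha > 1), D_B term neglected."""
    return s ** (1.0 / alpha - 1.0) / (alpha * math.sin(math.pi / alpha) * D_L ** (1.0 / alpha))

def W0_disconnected(alpha, D_B, D_L):
    """Disconnected Levy limit (alpha < 1), evaluated at s = 0."""
    C3 = 1.0 / ((2.0 - alpha) * math.sin((1.0 - alpha) * math.pi / (2.0 - alpha)))
    return C3 * D_B ** (alpha / (2.0 * (2.0 - alpha)) - 0.5) / D_L ** (1.0 / (2.0 - alpha))
```

With all three terms present, `W0` interpolates between these limits, and the crossover is controlled by the product $s\,\tau_{BL}$.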

(1.) Sliding search: Desorption from the DNA can be neglected for times $t \ll k_{\rm off}^{-1}$.
If also $t \ll \tau_{BL}$, Eq. (5) with $k_{\rm off} = 0$ by inverse Laplace transform leads to

$$J(t) \simeq (t/\tau_1)^{\gamma_1}, \quad \gamma_1 = 1/2, \quad \tau_1 = \pi/(16 D_B n_0^{1/\gamma_1}). \qquad (8)$$

In this regime, only the 1D sliding mechanism matters.

(2.) Fussy Lévy search: For $\tau_{BL} \ll t \ll k_{\rm off}^{-1}$ ($\alpha > 1$), the LF
dominates the flux into the target; from Eq. (6),

$$J(t) \simeq (t/\tau_2)^{\gamma_2}, \quad \gamma_2 = 1/\alpha, \quad \tau_2 = C_2/(D_L n_0^{1/\gamma_2}), \qquad (9)$$

where $C_2 = \{\Gamma(1 + 1/\alpha)/[\alpha \sin(\pi/\alpha)]\}^{\alpha}$. Now, LFs are the overall
dominating mechanism. This contrasts:

(3.) Sloppy Lévy search: For $\alpha < 1$, $t \gg \tau_{BL}$, and
$k_{\rm off}^{-1} \gg \tau_{BL}$, we obtain from Eq. (7)

$$J(t) \simeq (t/\tau_3)^{\gamma_3}, \quad \gamma_3 = 1, \quad \tau_3 = C_3 \frac{D_B^{\alpha/[2(2-\alpha)] - 1/2}}{D_L^{1/(2-\alpha)}\, n_0^{1/\gamma_3}}, \qquad (10)$$

and $C_3 = \{(2-\alpha) \sin([1-\alpha]\pi/[2-\alpha])\}^{-1}$. For $\alpha < 1$, even the mean
step length $\int dx\, |x| \lambda(x)$ diverges, making it impossible for the protein to hit a
small target solely by LF, and local sampling by 1D sliding becomes vital. At longer times,
volume exchange mediated by $k_{\rm off}$ enters:

(4.) Interrupted Lévy search: For $\alpha > 1$ and $t \gg k_{\rm off}^{-1} \gg \tau_{BL}$ we can
ignore $u$ in Eq. (6), yielding

$$J(t) \simeq (t/\tau_4)^{\gamma_4}, \quad \gamma_4 = 1, \quad \tau_4 = \frac{C_4}{D_L^{1/\alpha}\, k_{\rm off}^{1 - 1/\alpha}\, n_0^{1/\gamma_4}}, \qquad (11)$$

with $C_4 = 1/[\alpha \sin(\pi/\alpha)]$. The search on the DNA is dominated by LFs, interrupted
by 3D volume excursions.

(5.) Interrupted sliding search: If $\tau_{BL} \gg k_{\rm off}^{-1}$, LFs will not contribute at
any $t$. Instead we find from Eq. (5)

$$J(t) \simeq (t/\tau_5)^{\gamma_5}, \quad \gamma_5 = 1, \quad \tau_5 = 1/(2 D_B^{1/2} k_{\rm off}^{1/2} n_0^{1/\gamma_5}) \qquad (12)$$

for $t \gg k_{\rm off}^{-1}$. This is sliding-dominated search with 3D excursions. There exist
three scaling regimes for $1 < \alpha < 2$, and two for $0 < \alpha < 1$; see Fig. 2 and Tab. I.
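The five regimes can be collected in a small lookup routine. The sketch below simply evaluates the exponents $\gamma_i$ and time scales $\tau_i$ of Eqs. (8)-(12) for given parameters; the function names, the dictionary layout, and the ASCII regime labels are illustrative assumptions, and the routine does not decide which regime applies at a given time $t$ (that still requires comparing $t$ with $k_{\rm off}^{-1}$ and $\tau_{BL}$ as described above).

```python
import math

def regime_time_scales(alpha, D_B, D_L, k_off, n0):
    """Return {regime name: (gamma, tau)} plus the crossover time tau_BL."""
    out = {"tau_BL": (D_B**alpha / D_L**2) ** (1.0 / (2.0 - alpha))}
    # sliding-dominated regimes exist for any alpha
    out["sliding"] = (0.5, math.pi / (16.0 * D_B * n0**2))
    out["interrupted sliding"] = (1.0, 1.0 / (2.0 * math.sqrt(D_B * k_off) * n0))
    if alpha > 1.0:
        # connected Levy flights
        C4 = 1.0 / (alpha * math.sin(math.pi / alpha))
        C2 = (math.gamma(1.0 + 1.0 / alpha) * C4) ** alpha
        out["fussy Levy"] = (1.0 / alpha, C2 / (D_L * n0**alpha))
        out["interrupted Levy"] = (
            1.0, C4 / (D_L ** (1.0 / alpha) * k_off ** (1.0 - 1.0 / alpha) * n0))
    elif alpha < 1.0:
        # disconnected Levy flights
        C3 = 1.0 / ((2.0 - alpha) * math.sin((1.0 - alpha) * math.pi / (2.0 - alpha)))
        out["sloppy Levy"] = (
            1.0,
            C3 * D_B ** (alpha / (2.0 * (2.0 - alpha)) - 0.5)
            / (D_L ** (1.0 / (2.0 - alpha)) * n0))
    return out

def power_law_J(t, gamma, tau):
    """Asymptotic arrival count J(t) = (t / tau)^gamma within one regime."""
    return (t / tau) ** gamma
```

For example, `power_law_J(t, *regime_time_scales(...)["sliding"])` reproduces $J(t) = 4 n_0 \sqrt{D_B t/\pi}$.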

TABLE I: Summary of search regimes. See text.

We found that the relevant time scales $k_{\rm off}^{-1}$ and $\tau_{BL}$ together with $\alpha$
give rise to 5 basic search regimes, each characterized by an exponent $\gamma_i$ and a
characteristic time scale $\tau_i$. In particular, we saw that
$J(t) \simeq (t/\tau_i)^{\gamma_i}$, where the exponent $\gamma_i < 1$ for the first two regimes
($i = 1, 2$); in the other cases, we have $\gamma_i = 1$. The stable index $\alpha$
characterizing the polymer statistics thus strongly influences the overall search. Also note
that $J(t) \sim t$ when $t \gg k_{\rm off}^{-1}$, or $t \gg \tau_{BL}$ and $\alpha < 1$. The
characteristic time scales $\tau_i$, since $J(t) \sim n_0$, scale like
$\tau_i \sim n_0^{-1/\gamma_i}$. As any integral $I = \int_0^{\infty} dt\, f(t/\tau_i)$ can be
transformed by $s = t/\tau_i$ to $I = \tau_i \int_0^{\infty} ds\, f(s)$, it is $I \sim \tau_i$.
Thus, we find that the mean first arrival time scales like
$T = \tau_i \int_0^{\infty} ds\, \exp(-s^{\gamma_i}) = \tau_i \Gamma(1/\gamma_i)/\gamma_i \sim n_0^{-1/\gamma_i}$
(see below) whenever a single one of the five regimes dominates the integral. In particular, the
variation of $T^{-1}$ with the line density $n_0$ ranges from quadratic (1D sliding) over
$n_0^{\alpha}$ in the fussy Lévy regime ($1 < \alpha < 2$) to linear, the latter being shared by
sloppy Lévy and bulk mediated search. Note that if 1D sliding is the sole prevalent mechanism,
we recover the result $T = \pi/[8 D_B n_0^2]$ of Ref. [10].
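The single-regime estimate $T = \tau_i \Gamma(1/\gamma_i)/\gamma_i$ can be verified by integrating $\exp[-(t/\tau)^{\gamma}]$ directly. The sketch below is illustrative (hypothetical function names, log-grid trapezoidal quadrature as an arbitrary but convenient choice); for the sliding regime, $\gamma_1 = 1/2$ and $\tau_1 = \pi/(16 D_B n_0^2)$ give $T = 2\tau_1 = \pi/(8 D_B n_0^2)$.

```python
import math

def mean_arrival_time_closed(gamma, tau):
    """T = tau * Gamma(1/gamma) / gamma for J(t) = (t/tau)^gamma."""
    return tau * math.gamma(1.0 / gamma) / gamma

def mean_arrival_time_numeric(gamma, tau, dy=0.02, y_max=40.0):
    """T = int_0^inf exp(-(t/tau)^gamma) dt, with the substitution t = tau * exp(y)."""
    n = int(round(2.0 * y_max / dy))
    total = 0.0
    for i in range(n + 1):
        y = -y_max + i * dy
        weight = 0.5 if i in (0, n) else 1.0
        # dt = tau * exp(y) dy; the survival factor is exp(-exp(gamma * y))
        total += weight * math.exp(y) * math.exp(-math.exp(gamma * y))
    return tau * total * dy
```

The default grid is adequate for exponents of order one half to one; much smaller $\gamma$ would require a larger `y_max`.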

Optimal search. We now address the optimal search of the target, i.e., which $k_{\rm off}$
minimizes the mean first arrival time $T$ when $D_B$, $D_L$, $k_{\rm on}$, the DNA length $L$,
and the total amount of proteins are fixed. To quantify the latter, we define
$l_{\rm DNA} = L/V$, where $V$ is the system volume. The overall protein volume density is then
$n_{\rm total} = l_{\rm DNA} n_0 + n_{\rm bulk}$. With the equilibrium condition
$k_{\rm off} n_0 = k_{\rm on} n_{\rm bulk}$, this yields
$n_0 = n_{\rm total} k_{\rm on}/(k_{\rm off} + k'_{\rm on})$ and a corresponding expression for
$n_{\rm bulk}$; here, $k'_{\rm on} = k_{\rm on} l_{\rm DNA}$ is the inverse average time a single
protein spends in the bulk solvent before (re)binding to the DNA.
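The equilibrium partition between DNA-bound and bulk proteins follows from the two relations just stated. A minimal sketch (the function name and argument order are illustrative assumptions):

```python
def equilibrium_densities(n_total, k_on, k_off, l_dna):
    """Return (n0, n_bulk) from n_total = l_dna*n0 + n_bulk and k_off*n0 = k_on*n_bulk."""
    k_on_eff = k_on * l_dna                      # k'_on, inverse mean time spent in the bulk
    n0 = n_total * k_on / (k_off + k_on_eff)     # line density on the DNA
    n_bulk = k_off * n0 / k_on                   # volume density in the bulk
    return n0, n_bulk
```

The bound fraction `l_dna * n0 / n_total` equals $k'_{\rm on}/(k_{\rm off} + k'_{\rm on})$, the quantity that enters the tradeoff for the optimal off rate.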

To extract the mean first arrival time $T$, we reason as follows (compare Ref. [10]): The total
number of proteins that have arrived at the target between $t' = 0$ and $t$ is $J(t)$. If $N$ is
the overall number of proteins, the probability for an individual protein to have arrived at the
target is $J(t)/N$. In the limit of large $N$, we obtain the survival probability of the target
(no protein has arrived) as

$$P_{\rm surv}(t) = \lim_{N \to \infty} \left( 1 - J(t)/N \right)^N = \exp[-J(t)], \qquad (13)$$

and thus $T = \int_0^{\infty} dt\, P_{\rm surv}(t)$. Note that for LFs, the first arrival is
crucially different from the first passage [12].
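The large-$N$ limit in Eq. (13) converges quickly, as a two-line check shows. The helper names below are illustrative only.

```python
import math

def survival_finite_N(J, N):
    """Probability that none of N independent proteins has arrived, each with probability J/N."""
    return (1.0 - J / N) ** N

def survival_limit(J):
    """N -> infinity limit, exp(-J)."""
    return math.exp(-J)
```

The leading correction to the limit is of relative order $J^2/(2N)$, so the exponential form is already accurate for moderately large protein numbers as long as $J(t) \ll \sqrt{N}$.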

The optimization is complicated by the exponential function in Eq. (13). However, both in vitro
and in vivo, $n_{\rm total}$ (and hence $n_0$) is in many cases sufficiently small, such that
the relevant regime is $J(t) \propto t$ (i.e., we can approximate $W_0(u)$ by $W_0(u = 0)$). The
mean first arrival time in this linear regime becomes

$$T = W_0(u = 0)\, \left[ (k_{\rm off} + k'_{\rm on})/k'_{\rm on} \right] \left[ l_{\rm DNA}/n_{\rm total} \right]. \qquad (14)$$

We observe a tradeoff in the optimal value $k_{\rm off}^{\rm opt}$ that minimizes $T$: The
fraction $k'_{\rm on}/(k_{\rm off} + k'_{\rm on})$ of bound proteins shrinks with increasing
$k_{\rm off}$, increasing $T$. Counteracting is the decrease of $W_0(u = 0)$ (and $T$) with
growing $k_{\rm off}$.

FIG. 3: Optimal choice of off rate $k_{\rm off}$ as function of the LF diffusion constant, from
numerical evaluation of Eq. (14). The circle on the abscissa marks where
$k_{\rm off}^{\rm opt}$ becomes 0 (Eq. (17)).

Numerical solutions to the optimal search are shown in Fig. 3 for different $\alpha$. Three
different regimes emerge:

(i) Without LFs ($D_L \to 0$ or $D_L \ll D_B^{\alpha/2} (k'_{\rm on})^{1 - \alpha/2}$), from
Eq. (5) with $W_0$ at $u = 0$, we obtain $k_{\rm off}^{\rm opt} = k'_{\rm on}$: the proteins
should spend equal amounts of time in bulk and on the DNA. This corresponds to the result
obtained for single protein searching on a long DNA [?].

Two additional regimes unfold for strong LF search, $D_L \to \infty$:

(ii) For $\alpha > 1$, where Eq. (6) applies, we find

$$k_{\rm off}^{\rm opt} \simeq (\alpha - 1)\, k'_{\rm on}. \qquad (15)$$

The optimal off rate shrinks linearly with decreasing $\alpha$.

(iii) For $\alpha < 1$, the value of $k_{\rm off}^{\rm opt}$ approaches zero as
$D_L \to \infty$: The sloppy LF mechanism becomes so efficient that bulk excursions become
irrelevant. More precisely, for $1/2 < \alpha < 1$ as $D_L$ goes to infinity,

$$k_{\rm off}^{\rm opt} \simeq \ldots \qquad (16)$$

At $\alpha = 1/2$, we observe a qualitative change: When $\alpha < 1/2$, the rate
$k_{\rm off}^{\rm opt}$ reaches zero for all finite $D_L$ satisfying

$$\frac{D_L^{2/(2-\alpha)}}{D_B^{\alpha/(2-\alpha)}\, k'_{\rm on}} > \frac{(1 + \alpha) \sin([1-\alpha]\pi/[2-\alpha])}{(2 - \alpha) \sin([1-2\alpha]\pi/[2-\alpha])}. \qquad (17)$$

Note that when $\alpha < 1$, the spread of the LF ($\sim t^{1/\alpha}$) grows faster than the
number of sites visited ($\sim t$), rendering the mixing effect of bulk excursions
insignificant. A scaling argument to understand the crossover at $\alpha = 1/2$ relates the
probability density of first arrival with the width ($\sim t^{1/\alpha}$) of the Green's
function of an LF, $p_{\rm fa} \sim t^{-1/\alpha}$. The associated mean arrival time becomes
finite for $0 < \alpha < 1/2$, even for the infinite chain considered here.
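Regimes (i) and (ii) can be reproduced by minimizing Eq. (14) numerically. The sketch below is an illustration under stated assumptions rather than the procedure behind Fig. 3: it evaluates $W_0(u = 0)$ by a log-grid trapezoidal quadrature, drops the constant factor $l_{\rm DNA}/n_{\rm total}$, and locates the minimum by a golden-section search in $\ln k_{\rm off}$. All function names and the search window are hypothetical choices. The two limits are isolated by setting $D_L = 0$ (no LFs) or $D_B = 0$ (infinitely strong LFs, $\alpha > 1$).

```python
import math

def W0_at_zero_u(k_off, alpha, D_B, D_L, dy=0.05, y_max=40.0):
    """(1/pi) * int_0^inf dq / (D_B q^2 + D_L q^alpha + k_off) on a log grid."""
    n = int(round(2.0 * y_max / dy))
    total = 0.0
    for i in range(n + 1):
        q = math.exp(-y_max + i * dy)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * q / (D_B * q * q + D_L * q**alpha + k_off)
    return total * dy / math.pi

def reduced_search_time(k_off, k_on_eff, alpha, D_B, D_L):
    """Eq. (14) without the constant factor l_DNA / n_total."""
    return W0_at_zero_u(k_off, alpha, D_B, D_L) * (k_off + k_on_eff) / k_on_eff

def optimal_k_off(k_on_eff, alpha, D_B, D_L, span=1e4, iters=60):
    """Golden-section minimization over k_off in [k_on_eff/span, k_on_eff*span]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(k_on_eff / span), math.log(k_on_eff * span)
    f = lambda x: reduced_search_time(math.exp(x), k_on_eff, alpha, D_B, D_L)
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc < fd:            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return math.exp(0.5 * (a + b))
```

With both $D_B$ and $D_L$ nonzero the same routine interpolates between the two limits; for $\alpha < 1$ the minimum can move to the lower edge of the search window, which signals $k_{\rm off}^{\rm opt} \to 0$.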

Discussion. Eq. (1) phrases the target search problem as a fractional diffusion-reaction
equation with point sink. This formulation accounts for the fact that for LFs, the first
arrival differs from the first passage: With the long-tailed $\lambda(x)$ of an LF, the
particle can repeatedly jump across the target without hitting it, so that the first arrival
becomes less efficient than the first passage [12].

A borderline role is played by the Cauchy case $\alpha = 1$, separating connected (mean jump
length $\langle |x| \rangle$ exists) and disconnected LFs. For $\alpha < 1$, the number of
visited sites grows slower than the width of the search region and the LF mimics the
uncorrelated jumps of bulk excursion; the latter becomes obsolete for high LF diffusivity
$D_L$. Below $\alpha = 1/2$, bulk excursions are undesirable already for finite $D_L$. A
similar observation can be made for the scaling of the mean search time $T$ with the Lévy
diffusivity $D_L$, which is proportional to the rate at which an LF is performed: For
$\alpha > 1$ in the interrupted Lévy search, $T \sim D_L^{-1/\alpha}$, whereas
$T \sim D_L^{-1/(2-\alpha)}$ in the sloppy Lévy search, where $\alpha < 1$. The Lévy component
is thus exploited most efficiently when $\alpha$ approaches 1. Generally, too short jumps,
leading to local oversampling, as well as too long jumps, missing the target, are unfavorable.

A crucial assumption of the model, analogous to the derivation in Ref. [6], is that on the time
scale of the diffusion process the polymer chain appears annealed; otherwise, individual jumps
are no longer uncorrelated. Generally, for proteins $D_B$ is fairly low, and can be further
lowered by adjusting the salt condition, so that the conditions for the annealed case can be
met. Conversely, by increasing $D_B$ with respect to the polymer dynamics, leading to a higher
probability to use the same looping-induced 'shortcut' repeatedly, it might be possible to
investigate the turnover from LF motion to 'paradoxical diffusion' of the quenched polymer
case [6].

Single molecule studies can probe the dynamics of the target search and the quantitative
predictions of our model [10, ?]. Monitoring the target finding dynamics may also be a novel
way of investigating soft matter properties regarding both polymer equilibrium configurations,
giving rise to $\alpha$, and its dynamics. With respect to the first arrival properties, it
would be interesting to study the gradual change of the polymer properties from self-avoiding
behavior in a good solvent to Gaussian chain statistics under $\theta$ or dense conditions.

In a next step, it will be of interest to explore effects on the DNA looping behavior due to
(a) the occurrence of local denaturation bubbles acting as hinges, whose dynamics can be
understood from statistical approaches [?]; or (b) kinks imprinted on the DNA locally by
binding proteins. In the presence of different protein species, the first arrival method may
provide a way to probe protein crowding effects to expand existing models toward the in vivo
situation.

Conclusion. Our search model reveals rich behavior in dependence of the LF diffusivity $D_L$
and exponent $\alpha$. In particular, we found two crossovers for the optimal search that we
expect to be accessible experimentally. In that sense, our model system is richer than the 2D
albatross search model [4]. We note that in the Cauchy case $\alpha = 1$ additional logarithmic
contributions are superimposed on the power laws [20]. Moreover, long-time memory effects may
occur in the process; in the protein search, e.g., there are indications that both the sliding
search through stronger protein-DNA interactions [9] and the volume diffusion through crowding
effects are subdiffusive [?].

We thank I. M. Sokolov and U. Gerland for discussions.